Sinkhorn potentials
A Preliminaries on the Sinkhorn Potentials
One can observe that the Sinkhorn potentials are not unique. In this section, we provide several lemmas, under Assumptions A.2 and A.3, to show the Lipschitz continuity (w.r.t. the underlying measures) of the Sinkhorn potentials. These lemmas will be used in the convergence analysis and the mean-field analysis for SD; please see the proof in Appendix C.3. It would be an interesting future work to remove this factor. We note that Lemma B.1 is strictly stronger than preexisting results: (1) Proposition 13 of Feydy et al. [2019] only shows that the dual potentials are continuous (not Lipschitz continuous), which is strictly weaker than (i) of Lemma B.1.
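The non-uniqueness noted above is easy to see concretely. As a minimal illustration (not the construction used in the lemmas), the sketch below computes the dual Sinkhorn potentials by log-domain Sinkhorn iterations, assuming a squared-Euclidean cost and entropic regularization `eps`:

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn_potentials(x, y, a, b, eps=0.5, iters=1000):
    """Log-domain Sinkhorn iterations for the dual potentials (f, g) of
    entropic OT between discrete measures (x, a) and (y, b).
    A minimal sketch with squared-Euclidean cost."""
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # cost matrix C_ij
    f, g = np.zeros(len(x)), np.zeros(len(y))
    for _ in range(iters):
        # softmin updates: f_i = -eps * log sum_j b_j exp((g_j - C_ij)/eps)
        f = -eps * logsumexp((g[None, :] - C) / eps + np.log(b)[None, :], axis=1)
        g = -eps * logsumexp((f[:, None] - C) / eps + np.log(a)[:, None], axis=0)
    return f, g
```

The induced plan P_ij = a_i b_j exp((f_i + g_j - C_ij)/eps) is invariant under the shift (f, g) -> (f + c, g - c) for any constant c, which is exactly the non-uniqueness mentioned above.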
Sinkhorn Barycenters with Free Support via Frank-Wolfe Algorithm
Giulia Luise, Saverio Salzo, Massimiliano Pontil, Carlo Ciliberto
We present a novel algorithm to estimate the barycenter of arbitrary probability distributions with respect to the Sinkhorn divergence. Based on a Frank-Wolfe optimization strategy, our approach proceeds by populating the support of the barycenter incrementally, without requiring any pre-allocation.
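The incremental support-growing strategy described in the abstract above can be sketched generically. In the toy below, `potential` is a stand-in for the first variation of the Sinkhorn-divergence objective at the current iterate (the paper's actual oracle is not reproduced); only the Frank-Wolfe pattern is shown: each iteration adds at most one atom chosen by a linear minimization oracle and reweights with the standard step size 2/(k+2).

```python
import numpy as np

def frank_wolfe_barycenter(candidates, potential, n_iters=20):
    """Free-support Frank-Wolfe sketch: grow the barycenter's support
    one atom per iteration, with no pre-allocated support.

    potential(point, support, weights) stands in for the gradient
    (first variation) of the objective at the current iterate."""
    support = [candidates[0]]          # start from a single Dirac mass
    weights = np.array([1.0])
    for k in range(1, n_iters):
        # linear minimization oracle: candidate with the lowest potential
        scores = [potential(c, support, weights) for c in candidates]
        new_atom = candidates[int(np.argmin(scores))]
        gamma = 2.0 / (k + 2.0)        # standard Frank-Wolfe step size
        weights = (1.0 - gamma) * weights
        support.append(new_atom)
        weights = np.append(weights, gamma)
    return np.array(support), weights
```

Note that the update keeps the weights on the simplex: (1 - gamma) times the old total plus gamma is always 1.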
Parameter tuning and model selection in Optimal Transport with semi-dual Brenier formulation
Over the past few years, numerous computational models have been developed to solve Optimal Transport (OT) in a stochastic setting, where distributions are represented by samples and where the goal is to find the closest map to the ground truth OT map, unknown in practical settings. So far, no quantitative criterion has yet been put forward to tune the parameters of these models and select maps that best approximate the ground truth. To perform this task, we propose to leverage the Brenier formulation of OT. Theoretically, we show that this formulation guarantees that, up to a sharp distortion parameter depending on the smoothness/strong convexity and a statistical deviation term, the selected map achieves the lowest quadratic error to the ground truth. This criterion, estimated via convex optimization, enables parameter tuning and model selection among entropic regularization of OT, input convex neural networks and smooth and strongly convex nearest-Brenier (SSNB) models. We also use this criterion to question the use of OT in Domain-Adaptation (DA). In a standard DA experiment, it enables us to identify the potential that is closest to the true OT map between the source and the target. Yet, we observe that this selected potential is far from being the one that performs best for the downstream transfer classification task.
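The selection criterion described in this abstract evaluates candidate Brenier potentials through the semi-dual objective. A hedged sketch, not the paper's estimator: given values of a candidate convex potential phi at source samples, the conjugate phi* at target samples can be approximated by a discrete Legendre transform, and candidates compared by the empirical semi-dual value, which the true Brenier potential minimizes.

```python
import numpy as np

def semi_dual_value(phi_x, x, y):
    """Empirical semi-dual Brenier objective for a candidate potential.

    phi_x: values of a candidate convex potential at source samples x.
    phi*(y_j) is approximated by the discrete Legendre transform
    max_i <x_i, y_j> - phi(x_i); lower values indicate a better candidate."""
    conj = (x @ y.T - phi_x[:, None]).max(axis=0)  # discrete conjugate phi*(y_j)
    return phi_x.mean() + conj.mean()
```

Model selection then amounts to evaluating this value for each candidate potential and keeping the argmin.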
Sinkhorn Natural Gradient for Generative Models
Shen, Zebang, Wang, Zhenfu, Ribeiro, Alejandro, Hassani, Hamed
We consider the problem of minimizing a functional over a parametric family of probability measures, where the parameterization is characterized via a push-forward structure. An important application of this problem is in training generative adversarial networks. In this regard, we propose a novel Sinkhorn Natural Gradient (SiNG) algorithm which acts as a steepest descent method on the probability space endowed with the Sinkhorn divergence. We show that the Sinkhorn information matrix (SIM), a key component of SiNG, has an explicit expression and can be evaluated accurately in complexity that scales logarithmically with respect to the desired accuracy. This is in sharp contrast to existing natural gradient methods that can only be carried out approximately. Moreover, in practical applications when only Monte-Carlo type integration is available, we design an empirical estimator for SIM and provide the stability analysis. In our experiments, we quantitatively compare SiNG with state-of-the-art SGD-type solvers on generative tasks to demonstrate the efficiency and efficacy of our method.
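The steepest-descent structure of SiNG can be sketched generically: a natural-gradient step preconditions the Euclidean gradient by the inverse of a metric matrix. In the sketch below, `metric` is a stand-in for the Sinkhorn information matrix (its explicit evaluation from the paper is not reproduced), and a small damping term is a common numerical safeguard, not part of the paper's method.

```python
import numpy as np

def natural_gradient_step(theta, grad, metric, lr=0.1, damping=1e-6):
    """One steepest-descent step in the geometry induced by `metric`.

    metric(theta) stands in for the Sinkhorn information matrix (SIM);
    the step direction is M(theta)^{-1} grad rather than grad itself."""
    M = metric(theta) + damping * np.eye(len(theta))  # regularize for stability
    direction = np.linalg.solve(M, grad)              # natural gradient
    return theta - lr * direction
```

On a quadratic loss with the Hessian as metric, a unit-step natural gradient behaves like a Newton step, which illustrates why preconditioning by the information matrix can outperform plain SGD-type updates.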
Sinkhorn Barycenter via Functional Gradient Descent
Shen, Zebang, Wang, Zhenfu, Ribeiro, Alejandro, Hassani, Hamed
In this paper, we consider the problem of computing the barycenter of a set of probability distributions under the Sinkhorn divergence. This problem has recently found applications across various domains, including graphics, learning, and vision, as it provides a meaningful mechanism to aggregate knowledge. Unlike previous approaches which directly operate in the space of probability measures, we recast the Sinkhorn barycenter problem as an instance of unconstrained functional optimization and develop a novel functional gradient descent method named Sinkhorn Descent (SD). We prove that SD converges to a stationary point at a sublinear rate, and under reasonable assumptions, we further show that it asymptotically finds a global minimizer of the Sinkhorn barycenter problem. Moreover, by providing a mean-field analysis, we show that SD preserves the weak convergence of empirical measures. Importantly, the computational complexity of SD scales linearly in the dimension $d$ and we demonstrate its scalability by solving a $100$-dimensional Sinkhorn barycenter problem.
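The functional-gradient idea behind Sinkhorn Descent can be sketched for the simplest case of moving one particle cloud toward a single target. This is a simplified illustration, not the paper's barycenter objective: it descends the entropic OT cost (not the full Sinkhorn divergence, which also involves the self-interaction terms), with uniform weights and squared-Euclidean cost, and pushes each particle along the gradient of its dual potential.

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn_descent_step(x, y, eps=1.0, lr=0.2, inner_iters=300):
    """One functional-gradient update of particles x toward target samples y.
    Simplified sketch: uniform weights, squared-Euclidean cost."""
    n, m = len(x), len(y)
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    f, g = np.zeros(n), np.zeros(m)
    for _ in range(inner_iters):  # log-domain Sinkhorn for the dual potentials
        f = -eps * logsumexp((g[None, :] - C) / eps - np.log(m), axis=1)
        g = -eps * logsumexp((f[:, None] - C) / eps - np.log(n), axis=0)
    # gradient of the potential at each particle:
    # grad f(x_i) = sum_j w_ij * 2 (x_i - y_j), with softmax weights w
    w = np.exp((f[:, None] + g[None, :] - C) / eps) / m
    grad = 2.0 * (w.sum(1)[:, None] * x - w @ y)
    return x - lr * grad
```

Iterating this step transports the empirical measure of x toward y, which is the particle-level picture behind the method's linear-in-dimension cost: only pairwise distances and per-particle gradients are needed.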